Why Implementation Matters: Evaluation of an Open-source Constraint Grammar Parser

Authors

  • Dávid Márk Nemeskey
  • Francis M. Tyers
  • Mans Hulden
Abstract

In recent years, the problem of finite-state constraint grammar (CG) parsing has received renewed attention. Several compilers have been proposed to convert CG rules to finite-state transducers. While these formalisms serve their purpose as proofs of concept, the performance of the generated transducers lags behind other CG implementations and taggers. In this paper, we argue that the fault lies with using generic finite-state libraries, and not with the formalisms themselves. We present an open-source implementation that capitalises on the characteristics of CG rule application to improve execution time. On smaller grammars our implementation achieves performance comparable to the current open-source state of the art.

Similar resources

GraWiTas: a Grammar-based Wikipedia Talk Page Parser

Wikipedia offers researchers unique insights into the collaboration and communication patterns of a large self-regulating community of editors. The main medium of direct communication between editors of an article is the article’s talk page. However, a talk page file is unstructured and therefore difficult to analyse automatically. A few parsers exist that enable its transformation into a struc...

A Dependency Constraint Grammar for Esperanto

This paper presents a rule-based formalism for dependency annotation within the Constraint Grammar framework, implemented as an extension of the open source CG3 compiler. As a proof of concept we have constructed a complete dependency grammar for Esperanto, building on morphosyntactically annotated input from the EspGram parser. The system is described and evaluated on a test corpus. With a 4% ...

Using An Open-Source Unification-Based System For CL/NLP Teaching

We demonstrate the open-source LKB system which has been used to teach the fundamentals of constraint-based grammar development to several groups of students. 1 Overview of the LKB system The LKB system is a grammar development environment that is distributed as part of the open source LinGO tools (http://wwwcsli.stanford.edu/~aac/lkb.html and http://lingo.stanford.edu, see also Copestake and F...

HaG— A Computational Grammar of Hausa

In this paper, I shall give an overview of HaG (=Hausa Grammar), an emerging computational grammar of Hausa, developed within the framework of Head-driven Phrase Structure (Pollard and Sag, 1987, 1994; Sag, 1997). Since HPSG is an integrated theory of syntax and semantics, meaning representations are built up in tandem with syntactic analysis. Semantics in HaG are represented using Minimal Rec...

Logical Grammars Based on Constraint Handling Rules

A grammar formalism called CHRG based on CHR is proposed analogously to the way Definite Clause Grammars are defined and implemented on top of Prolog. A CHRG executes as a robust bottom-up parser with an inherent treatment of ambiguity. The rules of a CHRG may refer to grammar symbols on either side of a sequence to be matched and this provides a powerful way to let parsing and attribute evalua...

Publication date: 2014